Abstract

The Environmental Systems Research Institute (ESRI) geographic information system (GIS)
in use at Goddard Space Flight Center provides users with the means of combining remote
sensing data with ancillary data (soils maps, geologic maps, topographic maps, etc.) and
performing qualitative analyses on the resulting multivariable data base. However, statistical
techniques such as multiple regression, analysis of variance and spatial autocorrelation
analysis are not available in the GIS. This paper describes interfaces between ESRI's GIS data
files and real valued data files, written to facilitate statistical analysis and display of
spatially referenced multivariable data. An example of data analysis which utilized the GIS
and the Statistical Analysis System (SAS) is presented to illustrate the utility of combining
the analytic capability of a statistical package with the data management and display
features of the GIS.
INTERFACES BETWEEN STATISTICAL ANALYSIS PACKAGES
AND THE ESRI GEOGRAPHIC INFORMATION SYSTEM 


INTRODUCTION 

Automated geographic information systems provide a framework in which spatially referenced
data (maps and remote sensing imagery) may be manipulated, displayed and analyzed (Knapp,
1979). The utility of a geographic information system (GIS) lies in its ability to create and
analyze a multivariable file derived from maps and images of a study area. Geographic
information systems have been used by land planners to select sites for campgrounds in
Canadian national parks (Arbour, 1980), by researchers at the National Cancer Institute to
study patterns of mortality (Mason, 1980) and by geologists to examine relationships between
petrologic and geophysical information on the moon (Andre et al., 1977).

The Environmental Systems Research Institute (ESRI) GIS provides the user with a tool for
creating multivariable files from digitized map data and digital remote sensing data. A GIS
multivariable file may be thought of as a cube made up of data cells. The horizontal axes of
the cube correspond to geographic location. Along the vertical axis are the different
variables in the multivariable file. Thus, each column of cells in the cube has a unique
geographic location and each layer of cells in the cube corresponds to a unique thematic map
variable, such as soil type, vegetation type or topographic elevation (see Figure 1). Analyses
which can be performed on the multivariable files include slope and aspect calculations,
proximity analyses and the creation of qualitative models based on user supplied weights
(ESRI staff, 1979). However, statistical analyses, such as multiple regression, analysis of
variance and spatial autocorrelation analysis, cannot be performed within the GIS. Further,
since observations in a GIS multivariable file are stored as sixteen bit integers, real values
and integers outside the range -32,768 to 32,767 cannot be manipulated by the GIS.
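
The cube described above can be pictured with a short sketch. It is an illustration only and does not reproduce the ESRI file layout; the names make_cube, fits_int16 and column_of_cells, and the layer order, are hypothetical. The sketch shows the two points made in the text: every geographic location carries one value per thematic layer, and a cell can hold only a sixteen bit integer.

```python
# Hypothetical illustration of a GIS multivariable file as a "cube":
# cube[layer][row][col], one layer per thematic variable.
INT16_MIN, INT16_MAX = -32768, 32767  # range of a sixteen bit signed integer


def make_cube(layers, rows, cols, fill=0):
    """Build a cube of integer cells indexed as (layer, row, column)."""
    return [[[fill] * cols for _ in range(rows)] for _ in range(layers)]


def fits_int16(value):
    """True if value is an integer that a sixteen bit cell can hold."""
    return isinstance(value, int) and INT16_MIN <= value <= INT16_MAX


def column_of_cells(cube, row, col):
    """All variables observed at one geographic location."""
    return [layer[row][col] for layer in cube]


# Three assumed layers (for example soil, vegetation, elevation) on a 2 x 4 grid.
cube = make_cube(layers=3, rows=2, cols=4)
cube[2][1][3] = 1850  # elevation layer, row 1, column 3
```

A real value such as 12.5, or an integer such as 40000, fails the fits_int16 check, which is the limitation the interfaces described below are meant to work around.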

The inability to perform statistical analyses on GIS multivariable files precludes the
development of powerful quantitative models for resource exploration or land use planning
within the GIS package. However, statistical packages, such as SAS, provide tools for
performing a range of statistical analyses, from the computation of simple descriptive
statistics to complex multivariate techniques (Helwig and Council, 1979).

To provide a means of performing statistical analyses on variables in GIS files and to allow
the results of these analyses to be merged with GIS multivariable files, interfaces between the
ESRI GIS and real valued data files were created.

DESCRIPTION OF INTERFACES 

SAS2GIS is a computer program which converts gridded real valued data into the single
variable file (SVF) format used by the ESRI geographic information system. The data file of
real values that is to be converted to an integer SVF must be sorted so that, for an SVF of
n rows by m columns, the real value for cell (i, j) in the SVF corresponds to entry (i*m+j) in
the real valued data file. The conversion is accomplished by a linear transformation,
y = ax + b, which maps a real valued variable x into an integer variable y. The program
provides the user with two options for converting the data into integer format. First, the
user may supply the a and b terms of y = ax + b. Second, the user may specify a range for the
transformed data. The original data will be mapped into this range using the linear
transformation, and the transformed data will be checked to ensure that the conversion did
not truncate the values by more than a user specified amount. After the transformed data have
been written to a disk file, an additional record is added to the file which contains the a
and b terms that were used to make the transformation. This record is not read by any GIS
software but may be used by the interface GIS2SAS to convert the SVF file to a real valued
data file.
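
The two conversion options can be sketched as follows. This is a minimal illustration under stated assumptions, not the SAS2GIS source: the function names are hypothetical, the a and b terms for the second option are derived from the data range in the obvious way, integers are produced by rounding (the program itself may truncate), and the tolerance is measured in the units of the original data after converting back.

```python
def fit_linear_map(xmin, xmax, newmin, newmax):
    """Return (a, b) so that y = a*x + b sends [xmin, xmax] onto [newmin, newmax]."""
    if xmax == xmin:
        raise ValueError("input data have zero range")
    a = (newmax - newmin) / (xmax - xmin)
    b = newmin - a * xmin
    return a, b


def to_integers(values, a, b, toler=None):
    """Apply y = a*x + b and convert to integers.

    If toler is given, raise ValueError when converting an integer back
    with x = (y - b)/a differs from the original by more than toler.
    """
    if a == 0:
        raise ValueError("a must be nonzero")
    out = []
    for x in values:
        y = int(round(a * x + b))  # assumption: round to nearest integer
        if toler is not None and abs((y - b) / a - x) > toler:
            raise ValueError("tolerance exceeded during transformation")
        out.append(y)
    return out


# Option 2: map data spanning 0.0 to 10.0 into the integer range 0 to 1000.
a, b = fit_linear_map(0.0, 10.0, 0, 1000)
cells = to_integers([0.0, 2.5, 10.0], a, b, toler=0.01)
```

A wider integer range gives a larger a and therefore a smaller loss of precision, at the cost of approaching the sixteen bit limit of the SVF.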

GIS2SAS is a computer program which transforms integer data in ESRI's SVF format into
real valued data with row column references. The formula used to convert integer data to real
valued data is x = (y-b)/a, where x is a real value, y is an integer and a and b are constants.
The user has two options for converting integers to real values. First, the user may specify
the a and b terms for the transformation. This is required if the SVF file was not created by
SAS2GIS. Second, if no a and b values are provided by the user, the program will use the a and
b values in the last record written by SAS2GIS when the file was created. Program output
consists of a real valued data file in a format specified by the user.
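
The inverse transformation and its default can be sketched in the same way. The function to_reals and its trailer argument are hypothetical names; trailer stands for the (a, b) pair that SAS2GIS stores in the last record of the SVF. A user supplied a of zero is treated as "not supplied", following the prompt shown in Appendix 3.

```python
def to_reals(int_rows, a=0.0, b=0.0, trailer=None):
    """Convert rows of integers to real values with x = (y - b)/a.

    If a is zero, fall back on the (a, b) pair from the trailing SVF record.
    """
    if a == 0.0:
        if trailer is None:
            raise ValueError("no a and b terms supplied and none stored in the file")
        a, b = trailer
    return [[(y - b) / a for y in row] for row in int_rows]


# Terms taken from the trailing record of a file written by the forward interface.
rows = to_reals([[0, 250], [1000, 500]], trailer=(100.0, 0.0))

# Terms supplied by the user, as required for an SVF not created by SAS2GIS.
one = to_reals([[3]], a=2.0, b=1.0)
```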

RUNNING SAS2GIS AND GIS2SAS 

SAS2GIS and GIS2SAS are interactive computer programs written in FORTRAN IV. At
Goddard Space Flight Center they run in the foreground on an IBM 360/91 and an IBM 360/75.
Listings of these programs appear in Appendix 1. The clists (files which contain TSO commands
and subcommands) used to set up and run the interfaces are presented in Appendix 2. To run
a program the user types the program's name, SAS2GIS or GIS2SAS, followed by I("input
filename") and O("output filename"). For example, if the user types:

SAS2GIS I(BOTANY.DATA) O(GIS.DATA) 

the program SAS2GIS will be run to create an SVF file GIS.DATA from a real valued data file 
BOTANY.DATA. 

When the programs are running in the foreground they will prompt the user for two lines of 
input. The first line contains parameters used to make the transformations and the second line is 
the format of the real valued data file. Tables 1 and 2 summarize the input parameters for these 
interfaces. 

Should the user wish to convert these interfaces to run in a batch environment, he must
make two modifications. First, the clists must be replaced by JCL statements which assign disk
files for input and output to logical units 8 and 10, as shown in Figures 2 and 3. Second, the
write statements which prompt the user for input should be removed.

COMBINING THE GIS AND A STATISTICAL PACKAGE FOR DATA ANALYSIS 

Some preliminary results from an analysis of gravity and elevation data from the Rio Grande
rift are presented here to illustrate the utility of combining a package of statistical analysis
programs, SAS, with the ESRI geographic information system (GIS). The objective of this study
was to remove the relationship between Bouguer gravity and topographic relief which was present
in a data set of elevation and gravity observations compiled by Keller and Conrad at the
University of Texas.

The gravitational field measured at a given point on the earth's surface includes effects
unrelated to geology. These effects are due to variations in the distance between the earth's
center and the station where the gravity field was measured, and to the contribution of local
topography to the observed gravity (Grant and West, 1965). Once corrections for these effects
have been applied to the data, the resulting values are termed Bouguer gravity data and
reflect the contribution of underlying geologic structures.

At present there is no agreement on the best method for computing Bouguer gravity from
the observed gravity which is measured at a station. However, Nettleton (1940) has suggested
that when the corrections to reduce observed gravity to Bouguer gravity are properly applied, the
correlation between station elevation and Bouguer gravity should be low. A striking similarity
between the Bouguer gravity and station elevation was first observed in three dimensional plots
of these two data types produced by the GIS. In order to determine the degree of correlation
between the gridded elevation data from the rift (Figure 4) and gridded Bouguer data (Figure 5),
the statistical analysis package SAS76 was used to compute a Pearson product moment
correlation coefficient. The correlation between Bouguer gravity and elevation was high, -0.903,
and a linear regression was computed to remove the variation in Bouguer gravity due to
elevation. The residuals from the regression provide a better estimate of Bouguer gravity because
the effect of station elevation has been removed. A plot of these residuals, Figure 6, reveals
regions of high positive residuals in the north-west and south-east corners of the Rio Grande
study area. This suggests that there are regional variations in rock density in the study area
and that individual Bouguer gravity corrections should be calculated for each region.
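
The two statistical steps used here, a Pearson product moment correlation followed by a simple linear regression whose residuals are kept, can be sketched as below. The study itself used SAS; this sketch only illustrates the arithmetic, the function names are hypothetical, and the four elevation and gravity values are invented for the example and are not taken from the Rio Grande data set.

```python
import math


def pearson_r(x, y):
    """Pearson product moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def regression_residuals(x, y):
    """Fit y = slope*x + intercept by least squares; return slope, intercept, residuals."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    residuals = [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]
    return slope, intercept, residuals


# Invented values: gravity falls linearly as elevation rises.
elevation = [1000.0, 1500.0, 2000.0, 2500.0]
gravity = [-150.0, -175.0, -200.0, -225.0]

r = pearson_r(elevation, gravity)
slope, intercept, resid = regression_residuals(elevation, gravity)
```

In the workflow described in this paper, the residuals would be returned to the GIS through SAS2GIS so that they can be plotted against location, as in Figure 6.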
In this analysis the Statistical Analysis System (SAS) and the ESRI GIS proved useful tools
for examining the relationship between variables in a spatially referenced multivariable database.

With these interfaces the researcher can utilize the extensive data base management and 
graphics capabilities of the GIS to complement existing software in the analysis of real valued 
multivariate data. 


Table 1

Input Parameters for SAS2GIS

Parameter  Column  Format  Function

First Input Line

ROW        1-5     I5      Number of rows in the SVF which will be created
                           from the input data file.
COLUMN     6-10    I5      Number of columns in SVF.
A          11-15   F5.0    Multiplicative term in y=ax+b, used to map real x
                           into integer y.
B          16-20   F5.0    Additive term in y=ax+b. If A and B are omitted,
                           NEWMIN and NEWMAX must be specified.
MIN        21-25   F5.0    Minimum value of input data. Default: value
                           calculated by the program.
MAX        26-30   F5.0    Maximum value of input data. Default: value
                           calculated by the program.
NEWMIN     31-37   I5      New minimum value after transformation to
                           integers.
NEWMAX     38-44   I5      New maximum value after transformation to
                           integers.
TOLER      45-49   F5.0    User specified tolerance for truncation resulting
                           from conversion of real values to integer format.
MISVAL     50-55   F6.0    Value indicating missing observations in the data.
NEWVAL     56-61   F6.0    Value which will be assigned to missing
                           observations in the SVF file. Default: 0.

Second Input Line

FMT        1-80    80A1    User supplied input format for data in real data
                           set.
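
The first input line is a fixed column record, so it can be decoded by slicing on the column positions in Table 1. The sketch below is a hypothetical helper rather than part of SAS2GIS. It assumes that MIN occupies columns 21-25 (the field between B and MAX), reads every field as a number, and returns None for a blank field so that the caller can apply the defaults listed in the table.

```python
# (name, first column, last column) for the first SAS2GIS input line.
# Column numbers are 1-based and inclusive, as in Table 1.
FIELDS = [
    ("ROW", 1, 5), ("COLUMN", 6, 10), ("A", 11, 15), ("B", 16, 20),
    ("MIN", 21, 25), ("MAX", 26, 30), ("NEWMIN", 31, 37), ("NEWMAX", 38, 44),
    ("TOLER", 45, 49), ("MISVAL", 50, 55), ("NEWVAL", 56, 61),
]
INTEGER_FIELDS = {"ROW", "COLUMN"}


def parse_parameter_line(line):
    """Split a fixed column parameter line into a dict of numbers (None if blank)."""
    line = line.ljust(61)  # pad short lines so every slice exists
    params = {}
    for name, first, last in FIELDS:
        text = line[first - 1:last].strip()
        if not text:
            params[name] = None
        elif name in INTEGER_FIELDS:
            params[name] = int(float(text))
        else:
            params[name] = float(text)
    return params


# The line entered in the Appendix 3 example: 36 rows, 48 columns, a = 1, b = 0.
example = "%5d%5d%5s%5s" % (36, 48, "1.", "0.")
params = parse_parameter_line(example)
```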








Table 2

Input Parameters for GIS2SAS

Parameter  Column  Format  Function

First Line of Input

A          1-6     F6.0    Divisor in x=(y-b)/a used to transform integer
                           values into real values. Default: use first value
                           in last record of SVF file.
B          7-12    F6.0    b term in x=(y-b)/a used to transform integer
                           values into real values. Default: use second value
                           in last record of SVF file.

Second Line of Input

NCOL       1-3     I3      Number of columns in output file.

Third Line of Input

FMT        1-80    80A1    User supplied output format for real valued
                           data set.
Figure 1. Structure of an ESRI GIS Multivariable File (MVF)

Figure 2. Logical Units Required by SAS2GIS

Figure 3. Logical Units Required by GIS2SAS



Figure 4. A Plot of Elevations in the Rio Grande Rift Study Area Produced by the ESRI GIS 
Figure 5. Bouguer Gravity Values from the Rio Grande Rift Study Area



Figure 6. Residuals from a Regression of Elevation and Bouguer Gravity 
      STOP 2
      CONTINUE
      RETURN
      END

      ... TOLERANCE EXCEEDED DURING TRANSFORMATION ...


B. GIS2SAS PROGRAM LISTING 




C     *********************************************************
C     *                                                       *
C     * TITLE: INTERFACE BETWEEN ESRI GEOGRAPHIC              *
C     *        INFORMATION SYSTEM SVF DATA SETS               *
C     *        AND REAL VALUED DATA SETS NEEDED BY            *
C     *        STATISTICAL PACKAGES                           *
C     *                                                       *
C     * PROGRAMMER: EDWARD MASUOKA                            *
C     *                                                       *
C     * DATE: AUGUST 1, 197?                                  *
C     *                                                       *
C     * THIS PROGRAM TRANSFORMS GIS DATA IN INTEGER*2         *
C     * INTO THE ORIGINAL REAL VALUED DATA IN 3 STEPS:        *
C     *                                                       *
C     * 1. READS ROW COLUMN INFORMATION FROM RECORD 1         *
C     *                                                       *
C     * 2. TRANSFORMS INTEGERS TO REAL VALUES USING           *
C     *    A LINEAR TRANSFORMATION: REAL=(INTEGER-B)/A        *
C     *                                                       *
C     * 3. WRITES THE REAL DATA TO A FILE WITH A              *
C     *    USER SUPPLIED FORMAT                               *
C     *                                                       *
C     * VARIABLE   FUNCTION                                   *
C     *                                                       *
C     * DATA       STORES REAL DATA                           *
C     * ROWCOL     STORES # OF ROWS AND COLUMNS               *
C     *            IN THE INPUT FILE                          *
C     * ROW        # OF ROWS IN DATA AND GIS FILE             *
C     * COLUMN     # COLUMNS IN DATA & GIS FILE               *
C     * AB         STORES LINEAR TRANSFORM PARAMS             *
C     * SLOPE      SLOPE OF LINEAR FUNCTION USED              *
C     *            TO TRANSFORM REAL DATA                     *
C     * INTER      INTERCEPT OF LINEAR TRANSFORM              *
C     * IROW       STORES A ROW OF THE TRANSFORMED DATA       *
C     * FMT        USER SUPPLIED FORMAT FOR REAL              *
C     *            VALUED DATA SET                            *
C     *                                                       *
C     *********************************************************

      REAL*4 DATA(150),AB(2),SLOPE,INTER
      INTEGER*2 ROWCOL(2),IROW(150)
      INTEGER*4 ROW,COLUMN,FMT(20),BLNK/'    '/
      LOGICAL*1 BUFFER(80)
      EQUIVALENCE (AB(1),SLOPE),(AB(2),INTER),(BUFFER(1),FMT(1))
C
C     READ USER SUPPLIED OUTPUT FORMAT AND PARAMETERS
C
      WRITE (6,4)
    4 FORMAT (1H0,T10,'PLEASE INPUT A (F5.0) B (F5.0) TERMS WHICH ',
     .'WILL BE USED TO MAKE'/
     .1H ,T10,'THE TRANSFORMATION'/
     .1H ,'(DEFAULT: IF ZERO, PROGRAM WILL USE LAST RECORD OF SVF)'/
     .1H ,'   A     B')
      READ (5,11) SLOPE,INTER
   11 FORMAT (2F6.0)
      WRITE (6,2)
    2 FORMAT (' PLEASE INPUT # OF COLUMNS FOR OUTPUT DATA SET: (I3)'/
     .' DEFAULT: IF ZERO, PROGRAM WILL USE # COLUMNS IN SVF')
      READ (5,6) NCOL
    6 FORMAT (I3)
      WRITE (6,8)
    8 FORMAT (/' PLEASE INPUT FORMAT FOR REAL VALUED DATA SET: (80A1)',
     ./1H0,' DEFAULT: LAST RECORD OF SVF WILL BE USED')
      READ (5,1) FMT
    1 FORMAT (80A1)
C
C     READ AND ECHO # ROWS AND COLUMNS IN GIS FILE
C
      READ (8) ROWCOL
      ROW=ROWCOL(1)
      COLUMN=ROWCOL(2)
      WRITE (6,3) ROW,COLUMN
    3 FORMAT (1H0,T10,'ROWS=',T22,I5/1X,T10,'COLUMNS=',T22,I5)
      IF (SLOPE .NE. 0.0) GO TO 40
C
C     SPACE PAST TRANSFORMED DATA
C
      DO 30 I=1,ROW
   30 CALL READC (IROW,COLUMN,8)
C
C     READ AND ECHO PARAMETERS USED TO TRANSFORM ORIGINAL REAL DATA
C
      READ (8) AB
   40 WRITE (6,5) SLOPE,INTER
    5 FORMAT (1H ,T10,'SLOPE=',T19,F8.3/1X,T10,'INTERSEPT=',T21,F8.3)
C
C     CHECK FOR USER SPECIFIED FORMAT
C
      DO 50 I=1,20
      IF (FMT(I) .NE. BLNK) GO TO 60
   50 CONTINUE
C
C     NONE FOUND USE LAST RECORD OF SVF
C
      READ (8,END=99) FMT
   60 REWIND 8
C
C     TRANSFORM THE DATA INTO ORIGINAL VALUES
C     A ROW AT A TIME
C
      IF (NCOL .EQ. 0) NCOL=COLUMN
      READ (8) ROWCOL
      DO 10 I=1,ROW
      CALL READC (IROW,COLUMN,8)
      DO 20 J=1,COLUMN
   20 DATA(J)=(IROW(J)-INTER)/SLOPE
C
C     WRITE THE REAL VALUED DATA FILE
C
   10 CALL WRITEC (DATA,NCOL,FMT,10)
      WRITE (6,12)
   12 FORMAT (/1H0,'++++ SUCCESSFUL CONVERSION TO REAL VALUED FILE ',
     .'++++')
      STOP 1
   99 WRITE (6,100)
  100 FORMAT (1H0,T10,'***** FATAL ERROR: NO FORMAT SPECIFIED',
     ./1H ,T10,'BY USER OR A FORMAT ON SVF')
      STOP 2
      END


C     *************************************************
C     *                                               *
C     * THIS SUBROUTINE READS A SVF FORMAT FILE       *
C     *                                               *
C     *************************************************
      SUBROUTINE READC (IROW,COLUMN,LUNIT)
      INTEGER*4 COLUMN
      INTEGER*2 IROW(COLUMN)
      READ (LUNIT) IROW
      RETURN
      END
C     *************************************************
C     *                                               *
C     * THIS SUBROUTINE WRITES THE REAL VALUED DATA   *
C     * WITH A USER SPECIFIED FORMAT                  *
C     *                                               *
C     *************************************************
      SUBROUTINE WRITEC (DATA,COLUMN,FMT,LUNIT)
      INTEGER*4 COLUMN
      REAL*4 DATA(COLUMN),FMT(20)
      WRITE (LUNIT,FMT) DATA
      RETURN
      END





APPENDIX 2 
CLISTS 


A. CLIST FOR SAS2GIS

PROC 0 INFILE (REAL.DATA) OUTFILE (SAS.GIS)
CALLOC DA (&INFILE.) F (FT08F001)

ALLOC DA (&OUTFILE.) N SP (10, 1) TR U (GIS)
CALLOC DA (&OUTFILE.) F (FT10F001)

DO SAS2SVF LIB (PROG.LOAD)


B. CLIST FOR GIS2SAS

PROC 0 INFILE (SAS.GIS) OUTFILE (REAL.DATA)
CALLOC DA (&INFILE.) F (FT08F001)

CALLOC DA (&OUTFILE.) N SP (10, 10) TR U (FORT)
CALLOC DA (&OUTFILE.) F (FT10F001)

ALLOC DA (*) F (FT05F001)

ALLOC DA (*) F (FT06F001)

DO SVF2SAS LIB (PROG.LOAD)





APPENDIX 3 

AN EXAMPLE OF THE OUTPUT FROM THE INTERACTIVE 
PROGRAMS, SAS2GIS AND GIS2SAS. PROGRAM OUTPUT IS IN 
UPPERCASE AND USER INPUT IS IN LOWERCASE LETTERS. 


A. EXAMPLE OF RUNNING SAS2GIS 

sas2gis infile (real.data) outfile (elev2.svf)

INPUT: ROW COLUMN A B MIN MAX NEW MIN NEW MAX
TOLERANCE MISS VAL NEW VAL USING.

(2I5, 2F5.0, 2F5.0, 2F7.0, F5.0, 2F6.0)

    5   10   15   20   25   30   35   40   45   50   55   60
    +    +    +    +    +    +    +    +    +    +    +    +
   36   48   1.   0.

ROWS =       36
COLUMNS =    48
SLOPE =      1.000
INTERSEPT =  0.0

PLEASE INPUT FORMAT OF REAL VALUED DATA
(10f5.0)

++++ SUCCESSFUL CONVERSION TO ESRI SVF FILE ++++
IHO002I STOP 1

CONDITION CODE = 01


B. EXAMPLE OF RUNNING GIS2SAS 
gis2sas infile (elev.svf) outfile (realdata)

PLEASE INPUT A (F5.0) B (F5.0) TERMS WHICH WILL BE USED TO MAKE
THE TRANSFORMATION

(DEFAULT: IF ZERO, PROGRAM WILL USE LAST RECORD OF SVF)

   A     B

   1.    0.

PLEASE INPUT # OF COLUMNS FOR OUTPUT DATA SET: (I3)

DEFAULT: IF ZERO, PROGRAM WILL USE # COLUMNS IN SVF

PLEASE INPUT FORMAT FOR REAL VALUED DATA SET: (80A1)

DEFAULT: LAST RECORD OF SVF WILL BE USED
(10f5.0)

ROWS =       36

COLUMNS =    48

SLOPE =      1.000

INTERSEPT =  0.0

++++ SUCCESSFUL CONVERSION TO REAL VALUED FILE ++++

IHO002I STOP 1

CONDITION CODE = 01